Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 18249 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.8 MiB |
| Average record size in memory | 104.0 B |
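The two memory figures above are consistent with each other, which a little arithmetic confirms. This is a minimal sketch, assuming that the total is simply the number of observations times the average record size and that 1 MiB means 2**20 bytes; the variable names are illustrative.

```python
# Values copied from the "Dataset statistics" table.
n_observations = 18249
avg_record_bytes = 104.0

# Assumption: total size = rows * average record size, with 1 MiB = 2**20 bytes.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 2**20

print(f"{total_bytes:.0f} B is about {total_mib:.1f} MiB")
```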
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 9 |
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| region has a high cardinality: 54 distinct values | High cardinality |
| AveragePrice is highly correlated with Total Volume and 6 other fields | High correlation |
| Total Volume is highly correlated with AveragePrice and 7 other fields | High correlation |
| 4046 is highly correlated with AveragePrice and 7 other fields | High correlation |
| 4225 is highly correlated with AveragePrice and 7 other fields | High correlation |
| 4770 is highly correlated with AveragePrice and 7 other fields | High correlation |
| Total Bags is highly correlated with AveragePrice and 7 other fields | High correlation |
| Small Bags is highly correlated with AveragePrice and 7 other fields | High correlation |
| Large Bags is highly correlated with AveragePrice and 7 other fields | High correlation |
| XLarge Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| Total Volume is highly correlated with 4046 and 6 other fields | High correlation |
| 4046 is highly correlated with Total Volume and 6 other fields | High correlation |
| 4225 is highly correlated with Total Volume and 6 other fields | High correlation |
| 4770 is highly correlated with Total Volume and 6 other fields | High correlation |
| Total Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| Small Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| Large Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| XLarge Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| Total Volume is highly correlated with 4046 and 6 other fields | High correlation |
| 4046 is highly correlated with Total Volume and 4 other fields | High correlation |
| 4225 is highly correlated with Total Volume and 4 other fields | High correlation |
| 4770 is highly correlated with Total Volume and 5 other fields | High correlation |
| Total Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| Small Bags is highly correlated with Total Volume and 5 other fields | High correlation |
| Large Bags is highly correlated with Total Volume and 1 other field | High correlation |
| XLarge Bags is highly correlated with Total Volume and 3 other fields | High correlation |
| AveragePrice is highly correlated with type and 1 other field | High correlation |
| Total Volume is highly correlated with 4046 and 7 other fields | High correlation |
| 4046 is highly correlated with Total Volume and 7 other fields | High correlation |
| 4225 is highly correlated with Total Volume and 7 other fields | High correlation |
| 4770 is highly correlated with Total Volume and 7 other fields | High correlation |
| Total Bags is highly correlated with Total Volume and 7 other fields | High correlation |
| Small Bags is highly correlated with Total Volume and 7 other fields | High correlation |
| Large Bags is highly correlated with Total Volume and 7 other fields | High correlation |
| XLarge Bags is highly correlated with Total Volume and 6 other fields | High correlation |
| type is highly correlated with AveragePrice | High correlation |
| region is highly correlated with AveragePrice and 7 other fields | High correlation |
| region is uniformly distributed | Uniform |
| 4046 has 242 (1.3%) zeros | Zeros |
| 4770 has 5497 (30.1%) zeros | Zeros |
| Large Bags has 2370 (13.0%) zeros | Zeros |
| XLarge Bags has 12048 (66.0%) zeros | Zeros |
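The percentages in the zeros alerts follow directly from the zero counts and the 18249 observations. A minimal sketch that recomputes them, assuming the percentages are rounded to one decimal place (the dictionary names are illustrative):

```python
# Zero counts as reported in the alerts; total rows from "Dataset statistics".
n_observations = 18249
zero_counts = {"4046": 242, "4770": 5497, "Large Bags": 2370, "XLarge Bags": 12048}

# Share of zeros per column, rounded to one decimal place as in the report.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```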
Reproduction
| Analysis started | 2022-02-01 09:55:59.471909 |
|---|---|
| Analysis finished | 2022-02-01 09:56:31.346799 |
| Duration | 31.87 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
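The reported duration can be checked against the two timestamps. A small sketch using only the standard library, with the timestamp format string being an assumption about how the values are written:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-02-01 09:55:59.471909", FMT)
finished = datetime.strptime("2022-02-01 09:56:31.346799", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```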
Date
Date
| Distinct | 169 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 142.7 KiB |
| Minimum | 2015-01-04 00:00:00 |
|---|---|
| Maximum | 2018-03-25 00:00:00 |
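The 169 distinct dates are what one would expect from weekly observations between the reported minimum and maximum. A sketch under that assumption (weekly spacing is suggested by the sample rows at the end of the report, but it is not stated by the report itself):

```python
from datetime import date, timedelta

# Minimum and maximum from the table above.
first, last = date(2015, 1, 4), date(2018, 3, 25)

# Assumption: exactly one observation date per week between first and last.
weekly_dates = []
current = first
while current <= last:
    weekly_dates.append(current)
    current += timedelta(weeks=1)

print(len(weekly_dates))  # 169, matching the "Distinct" row
```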
Histogram with fixed size bins (bins=50)
AveragePrice
Real number (ℝ≥0)
| Distinct | 259 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.40597841 |
| Minimum | 0.44 |
|---|---|
| Maximum | 3.25 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0.44 |
|---|---|
| 5-th percentile | 0.83 |
| Q1 | 1.1 |
| median | 1.37 |
| Q3 | 1.66 |
| 95-th percentile | 2.11 |
| Maximum | 3.25 |
| Range | 2.81 |
| Interquartile range (IQR) | 0.56 |
Descriptive statistics
| Standard deviation | 0.4026765555 |
|---|---|
| Coefficient of variation (CV) | 0.2864030861 |
| Kurtosis | 0.3251958507 |
| Mean | 1.40597841 |
| Median Absolute Deviation (MAD) | 0.28 |
| Skewness | 0.5803027379 |
| Sum | 25657.7 |
| Variance | 0.1621484083 |
| Monotonicity | Not monotonic |
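Several of these statistics are derived from others in the same tables. The sketch below recomputes the coefficient of variation, variance, IQR and range from the reported values, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared):

```python
# Values copied from the quantile and descriptive statistics tables.
mean = 1.40597841
std = 0.4026765555
q1, q3 = 1.10, 1.66
minimum, maximum = 0.44, 3.25

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance as the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.10f} variance={variance:.10f} IQR={iqr:.2f} range={value_range:.2f}")
```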
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.15 | 202 | 1.1% |
| 1.18 | 199 | 1.1% |
| 1.08 | 194 | 1.1% |
| 1.26 | 193 | 1.1% |
| 1.13 | 192 | 1.1% |
| 0.98 | 189 | 1.0% |
| 1.19 | 188 | 1.0% |
| 1.36 | 187 | 1.0% |
| 1.59 | 186 | 1.0% |
| 1.43 | 185 | 1.0% |
| Other values (249) | 16334 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.44 | 1 | < 0.1% |
| 0.46 | 1 | < 0.1% |
| 0.48 | 1 | < 0.1% |
| 0.49 | 2 | < 0.1% |
| 0.51 | 5 | |
| 0.52 | 3 | < 0.1% |
| 0.53 | 6 | |
| 0.54 | 7 | |
| 0.55 | 3 | < 0.1% |
| 0.56 | 12 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.25 | 1 | |
| 3.17 | 1 | |
| 3.12 | 1 | |
| 3.05 | 1 | |
| 3.04 | 1 | |
| 3.03 | 1 | |
| 3 | 2 | |
| 2.99 | 2 | |
| 2.97 | 1 | |
| 2.96 | 1 |
Total Volume
Real number (ℝ≥0)
| Distinct | 18237 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 850644.013 |
| Minimum | 84.56 |
|---|---|
| Maximum | 62505646.52 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 84.56 |
|---|---|
| 5-th percentile | 2371.862 |
| Q1 | 10838.58 |
| median | 107376.76 |
| Q3 | 432962.29 |
| 95-th percentile | 3716315.41 |
| Maximum | 62505646.52 |
| Range | 62505561.96 |
| Interquartile range (IQR) | 422123.71 |
Descriptive statistics
| Standard deviation | 3453545.355 |
|---|---|
| Coefficient of variation (CV) | 4.059918488 |
| Kurtosis | 92.10445778 |
| Mean | 850644.013 |
| Median Absolute Deviation (MAD) | 102962.47 |
| Skewness | 9.007687479 |
| Sum | 1.552340259 × 10¹⁰ |
| Variance | 1.192697552 × 10¹³ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4103.97 | 2 | < 0.1% |
| 3529.44 | 2 | < 0.1% |
| 46602.16 | 2 | < 0.1% |
| 13234.04 | 2 | < 0.1% |
| 3713.49 | 2 | < 0.1% |
| 19634.24 | 2 | < 0.1% |
| 3288.85 | 2 | < 0.1% |
| 9465.99 | 2 | < 0.1% |
| 2038.99 | 2 | < 0.1% |
| 2858.31 | 2 | < 0.1% |
| Other values (18227) | 18229 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 84.56 | 1 | |
| 379.82 | 1 | |
| 385.55 | 1 | |
| 419.98 | 1 | |
| 472.82 | 1 | |
| 482.26 | 1 | |
| 515.01 | 1 | |
| 530.96 | 1 | |
| 542.85 | 1 | |
| 561.1 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 62505646.52 | 1 | |
| 61034457.1 | 1 | |
| 52288697.89 | 1 | |
| 47293921.6 | 1 | |
| 46324529.7 | 1 | |
| 44655461.51 | 1 | |
| 43409835.75 | 1 | |
| 43167806.09 | 1 | |
| 42939821.55 | 1 | |
| 42867608.54 | 1 |
4046
Real number (ℝ≥0)
| Distinct | 17702 |
|---|---|
| Distinct (%) | 97.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 293008.4245 |
| Minimum | 0 |
|---|---|
| Maximum | 22743616.17 |
| Zeros | 242 |
| Zeros (%) | 1.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 19.6 |
| Q1 | 854.07 |
| median | 8645.3 |
| Q3 | 111020.2 |
| 95-th percentile | 1263359.678 |
| Maximum | 22743616.17 |
| Range | 22743616.17 |
| Interquartile range (IQR) | 110166.13 |
Descriptive statistics
| Standard deviation | 1264989.082 |
|---|---|
| Coefficient of variation (CV) | 4.317244747 |
| Kurtosis | 86.80911256 |
| Mean | 293008.4245 |
| Median Absolute Deviation (MAD) | 8616.69 |
| Skewness | 8.648219757 |
| Sum | 5347110739 |
| Variance | 1.600197377 × 10¹² |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 242 | 1.3% |
| 3 | 10 | 0.1% |
| 4 | 8 | < 0.1% |
| 1.24 | 8 | < 0.1% |
| 1 | 8 | < 0.1% |
| 1.25 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 1.21 | 6 | < 0.1% |
| 1.3 | 5 | < 0.1% |
| 1.27 | 5 | < 0.1% |
| Other values (17692) | 17943 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 242 | |
| 1 | 8 | < 0.1% |
| 1.13 | 1 | < 0.1% |
| 1.19 | 3 | < 0.1% |
| 1.2 | 1 | < 0.1% |
| 1.21 | 6 | < 0.1% |
| 1.22 | 5 | < 0.1% |
| 1.23 | 1 | < 0.1% |
| 1.24 | 8 | < 0.1% |
| 1.25 | 7 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 22743616.17 | 1 | |
| 21620180.9 | 1 | |
| 18933038.04 | 1 | |
| 17787611.93 | 1 | |
| 17076650.82 | 1 | |
| 16573573.78 | 1 | |
| 16529797.6 | 1 | |
| 16383685.07 | 1 | |
| 16215328.75 | 1 | |
| 16000107.8 | 1 |
4225
Real number (ℝ≥0)
| Distinct | 18103 |
|---|---|
| Distinct (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 295154.5684 |
| Minimum | 0 |
|---|---|
| Maximum | 20470572.61 |
| Zeros | 61 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 103.614 |
| Q1 | 3008.78 |
| median | 29061.02 |
| Q3 | 150206.86 |
| 95-th percentile | 1303657.658 |
| Maximum | 20470572.61 |
| Range | 20470572.61 |
| Interquartile range (IQR) | 147198.08 |
Descriptive statistics
| Standard deviation | 1204120.401 |
|---|---|
| Coefficient of variation (CV) | 4.079626508 |
| Kurtosis | 91.94902197 |
| Mean | 295154.5684 |
| Median Absolute Deviation (MAD) | 28521.3 |
| Skewness | 8.942465608 |
| Sum | 5386275718 |
| Variance | 1.44990594 × 10¹² |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 61 | 0.3% |
| 177.87 | 3 | < 0.1% |
| 215.36 | 3 | < 0.1% |
| 1.3 | 3 | < 0.1% |
| 1.26 | 3 | < 0.1% |
| 94.74 | 3 | < 0.1% |
| 13.6 | 2 | < 0.1% |
| 20.32 | 2 | < 0.1% |
| 35898.69 | 2 | < 0.1% |
| 6973.51 | 2 | < 0.1% |
| Other values (18093) | 18165 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 61 | |
| 1.26 | 3 | < 0.1% |
| 1.28 | 2 | < 0.1% |
| 1.3 | 3 | < 0.1% |
| 1.31 | 1 | < 0.1% |
| 1.32 | 2 | < 0.1% |
| 1.64 | 1 | < 0.1% |
| 2.39 | 1 | < 0.1% |
| 2.4 | 1 | < 0.1% |
| 2.48 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20470572.61 | 1 | |
| 20445501.03 | 1 | |
| 20328161.55 | 1 | |
| 18956479.74 | 1 | |
| 17896391.6 | 1 | |
| 16602589.04 | 1 | |
| 16054083.86 | 1 | |
| 15899858.37 | 1 | |
| 14888077.69 | 1 | |
| 14437190.03 | 1 |
4770
Real number (ℝ≥0)
| Distinct | 12071 |
|---|---|
| Distinct (%) | 66.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22839.73599 |
| Minimum | 0 |
|---|---|
| Maximum | 2546439.11 |
| Zeros | 5497 |
| Zeros (%) | 30.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 184.99 |
| Q3 | 6243.42 |
| 95-th percentile | 106156.574 |
| Maximum | 2546439.11 |
| Range | 2546439.11 |
| Interquartile range (IQR) | 6243.42 |
Descriptive statistics
| Standard deviation | 107464.0684 |
|---|---|
| Coefficient of variation (CV) | 4.705136192 |
| Kurtosis | 132.5634409 |
| Mean | 22839.73599 |
| Median Absolute Deviation (MAD) | 184.99 |
| Skewness | 10.15939563 |
| Sum | 416802342.1 |
| Variance | 1.1548526 × 10¹⁰ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5497 | |
| 2.66 | 7 | < 0.1% |
| 3.32 | 7 | < 0.1% |
| 10.97 | 6 | < 0.1% |
| 1.59 | 6 | < 0.1% |
| 1.64 | 6 | < 0.1% |
| 1.6 | 6 | < 0.1% |
| 2.74 | 5 | < 0.1% |
| 1.66 | 5 | < 0.1% |
| 1.18 | 5 | < 0.1% |
| Other values (12061) | 12699 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5497 | |
| 0.83 | 1 | < 0.1% |
| 1 | 3 | < 0.1% |
| 1.01 | 1 | < 0.1% |
| 1.09 | 1 | < 0.1% |
| 1.11 | 1 | < 0.1% |
| 1.12 | 1 | < 0.1% |
| 1.15 | 1 | < 0.1% |
| 1.16 | 1 | < 0.1% |
| 1.18 | 5 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2546439.11 | 1 | |
| 1993645.36 | 1 | |
| 1896149.5 | 1 | |
| 1880231.38 | 1 | |
| 1811090.71 | 1 | |
| 1800065.57 | 1 | |
| 1773088.87 | 1 | |
| 1770948.09 | 1 | |
| 1761343.08 | 1 | |
| 1753852.61 | 1 |
Total Bags
Real number (ℝ≥0)
| Distinct | 18097 |
|---|---|
| Distinct (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 239639.2021 |
| Minimum | 0 |
|---|---|
| Maximum | 19373134.37 |
| Zeros | 15 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 628.89 |
| Q1 | 5088.64 |
| median | 39743.83 |
| Q3 | 110783.37 |
| 95-th percentile | 1005478.892 |
| Maximum | 19373134.37 |
| Range | 19373134.37 |
| Interquartile range (IQR) | 105694.73 |
Descriptive statistics
| Standard deviation | 986242.3992 |
|---|---|
| Coefficient of variation (CV) | 4.115530309 |
| Kurtosis | 112.2721565 |
| Mean | 239639.2021 |
| Median Absolute Deviation (MAD) | 37299.96 |
| Skewness | 9.75607167 |
| Sum | 4373175798 |
| Variance | 9.7267407 × 10¹¹ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15 | 0.1% |
| 990 | 5 | < 0.1% |
| 300 | 5 | < 0.1% |
| 550 | 4 | < 0.1% |
| 266.67 | 4 | < 0.1% |
| 916.67 | 4 | < 0.1% |
| 286.67 | 3 | < 0.1% |
| 263.33 | 3 | < 0.1% |
| 196.67 | 3 | < 0.1% |
| 260 | 3 | < 0.1% |
| Other values (18087) | 18200 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15 | |
| 3.09 | 1 | < 0.1% |
| 3.11 | 1 | < 0.1% |
| 3.19 | 1 | < 0.1% |
| 3.33 | 1 | < 0.1% |
| 6.14 | 1 | < 0.1% |
| 6.18 | 1 | < 0.1% |
| 6.24 | 1 | < 0.1% |
| 6.36 | 1 | < 0.1% |
| 7.02 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 19373134.37 | 1 | |
| 16394524.11 | 1 | |
| 16298296.29 | 1 | |
| 15972492.07 | 1 | |
| 15804696.31 | 1 | |
| 15102426.94 | 1 | |
| 15051877.14 | 1 | |
| 14894893.8 | 1 | |
| 14504209.37 | 1 | |
| 14440611.5 | 1 |
Small Bags
Real number (ℝ≥0)
| Distinct | 17321 |
|---|---|
| Distinct (%) | 94.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 182194.6867 |
| Minimum | 0 |
|---|---|
| Maximum | 13384586.8 |
| Zeros | 159 |
| Zeros (%) | 0.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 256.67 |
| Q1 | 2849.42 |
| median | 26362.82 |
| Q3 | 83337.67 |
| 95-th percentile | 768147.228 |
| Maximum | 13384586.8 |
| Range | 13384586.8 |
| Interquartile range (IQR) | 80488.25 |
Descriptive statistics
| Standard deviation | 746178.515 |
|---|---|
| Coefficient of variation (CV) | 4.095500964 |
| Kurtosis | 107.0128851 |
| Mean | 182194.6867 |
| Median Absolute Deviation (MAD) | 25599.49 |
| Skewness | 9.540659982 |
| Sum | 3324870838 |
| Variance | 5.567823762 × 10¹¹ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 159 | 0.9% |
| 203.33 | 11 | 0.1% |
| 223.33 | 10 | 0.1% |
| 533.33 | 10 | 0.1% |
| 123.33 | 8 | < 0.1% |
| 196.67 | 8 | < 0.1% |
| 70 | 8 | < 0.1% |
| 103.33 | 8 | < 0.1% |
| 216.67 | 8 | < 0.1% |
| 20 | 8 | < 0.1% |
| Other values (17311) | 18011 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 159 | |
| 2.52 | 1 | < 0.1% |
| 2.57 | 1 | < 0.1% |
| 2.73 | 1 | < 0.1% |
| 2.79 | 1 | < 0.1% |
| 2.95 | 3 | < 0.1% |
| 2.96 | 1 | < 0.1% |
| 3.06 | 1 | < 0.1% |
| 3.09 | 1 | < 0.1% |
| 3.11 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13384586.8 | 1 | |
| 12567155.58 | 1 | |
| 12540327.19 | 1 | |
| 11712807.19 | 1 | |
| 11392828.89 | 1 | |
| 11228049.63 | 1 | |
| 11112405.61 | 1 | |
| 10844852.22 | 1 | |
| 10832907.44 | 1 | |
| 10666942.78 | 1 |
Large Bags
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 15082 |
|---|---|
| Distinct (%) | 82.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 54338.08814 |
| Minimum | 0 |
|---|---|
| Maximum | 5719096.61 |
| Zeros | 2370 |
| Zeros (%) | 13.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 127.47 |
| median | 2647.71 |
| Q3 | 22029.25 |
| 95-th percentile | 195699.768 |
| Maximum | 5719096.61 |
| Range | 5719096.61 |
| Interquartile range (IQR) | 21901.78 |
Descriptive statistics
| Standard deviation | 243965.9645 |
|---|---|
| Coefficient of variation (CV) | 4.489778218 |
| Kurtosis | 117.999481 |
| Mean | 54338.08814 |
| Median Absolute Deviation (MAD) | 2647.71 |
| Skewness | 9.796454599 |
| Sum | 991615770.5 |
| Variance | 5.951939186 × 10¹⁰ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2370 | 13.0% |
| 3.33 | 187 | 1.0% |
| 6.67 | 78 | 0.4% |
| 10 | 47 | 0.3% |
| 4.44 | 38 | 0.2% |
| 13.33 | 28 | 0.2% |
| 16.67 | 18 | 0.1% |
| 26.67 | 18 | 0.1% |
| 6.66 | 18 | 0.1% |
| 20 | 14 | 0.1% |
| Other values (15072) | 15433 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2370 | |
| 0.97 | 1 | < 0.1% |
| 1.3 | 1 | < 0.1% |
| 1.33 | 1 | < 0.1% |
| 1.38 | 2 | < 0.1% |
| 1.44 | 1 | < 0.1% |
| 1.48 | 1 | < 0.1% |
| 1.55 | 1 | < 0.1% |
| 1.56 | 1 | < 0.1% |
| 1.62 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5719096.61 | 1 | |
| 4324231.19 | 1 | |
| 4081397.72 | 1 | |
| 4023485.04 | 1 | |
| 3988101.74 | 1 | |
| 3917569.95 | 1 | |
| 3789722.9 | 1 | |
| 3618270.75 | 1 | |
| 3544729.39 | 1 | |
| 3434846.78 | 1 |
XLarge Bags
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 5588 |
|---|---|
| Distinct (%) | 30.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3106.426507 |
| Minimum | 0 |
|---|---|
| Maximum | 551693.65 |
| Zeros | 12048 |
| Zeros (%) | 66.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 142.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 132.5 |
| 95-th percentile | 12058.452 |
| Maximum | 551693.65 |
| Range | 551693.65 |
| Interquartile range (IQR) | 132.5 |
Descriptive statistics
| Standard deviation | 17692.89465 |
|---|---|
| Coefficient of variation (CV) | 5.695578058 |
| Kurtosis | 233.6026119 |
| Mean | 3106.426507 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.13975069 |
| Sum | 56689177.33 |
| Variance | 313038521.2 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 12048 | |
| 3.33 | 29 | 0.2% |
| 6.67 | 16 | 0.1% |
| 1.11 | 15 | 0.1% |
| 5 | 12 | 0.1% |
| 10 | 9 | < 0.1% |
| 16.67 | 8 | < 0.1% |
| 2.22 | 7 | < 0.1% |
| 150 | 6 | < 0.1% |
| 13.33 | 6 | < 0.1% |
| Other values (5578) | 6093 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 12048 | |
| 1 | 1 | < 0.1% |
| 1.11 | 15 | 0.1% |
| 1.26 | 1 | < 0.1% |
| 1.3 | 1 | < 0.1% |
| 1.38 | 1 | < 0.1% |
| 1.41 | 2 | < 0.1% |
| 1.45 | 1 | < 0.1% |
| 1.47 | 4 | < 0.1% |
| 1.49 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 551693.65 | 1 | |
| 454343.65 | 1 | |
| 390478.73 | 1 | |
| 387400.22 | 1 | |
| 377661.06 | 1 | |
| 373523.47 | 1 | |
| 347390.14 | 1 | |
| 328589.09 | 1 | |
| 326348.15 | 1 | |
| 321033.23 | 1 |
type
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 142.7 KiB |
| conventional | |
|---|---|
| organic |
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 9.500410981 |
| Min length | 7 |
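The length statistics follow from the two category labels and their counts (9126 conventional, 9123 organic, as listed in this variable's Common Values table). A minimal sketch, assuming length means the number of characters in the label:

```python
# Category counts as reported for this variable.
counts = {"conventional": 9126, "organic": 9123}

total = sum(counts.values())

# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * n for label, n in counts.items()) / total
max_length = max(len(label) for label in counts)
min_length = min(len(label) for label in counts)

print(f"mean={mean_length:.9f} max={max_length} min={min_length}")
```

The median length is 12 because the longer label, conventional, accounts for slightly more than half of the rows.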
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | conventional |
|---|---|
| 2nd row | conventional |
| 3rd row | conventional |
| 4th row | conventional |
| 5th row | conventional |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| conventional | 9126 | |
| organic | 9123 |
Length
Histogram of lengths of the category
Pie chart
year
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 142.7 KiB |
| 2017 | |
|---|---|
| 2016 | |
| 2015 | |
| 2018 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2015 |
|---|---|
| 2nd row | 2015 |
| 3rd row | 2015 |
| 4th row | 2015 |
| 5th row | 2015 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2017 | 5722 | |
| 2016 | 5616 | |
| 2015 | 5615 | |
| 2018 | 1296 | 7.1% |
Length
Histogram of lengths of the category
Pie chart
region
Categorical
| Distinct | 54 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 142.7 KiB |
| Albany | 338 |
|---|---|
| Sacramento | 338 |
| Northeast | 338 |
| NorthernNewEngland | 338 |
| Orlando | 338 |
| Other values (49) |
Length
| Max length | 19 |
|---|---|
| Median length | 9 |
| Mean length | 10.29535865 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Albany |
|---|---|
| 2nd row | Albany |
| 3rd row | Albany |
| 4th row | Albany |
| 5th row | Albany |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Albany | 338 | 1.9% |
| Sacramento | 338 | 1.9% |
| Northeast | 338 | 1.9% |
| NorthernNewEngland | 338 | 1.9% |
| Orlando | 338 | 1.9% |
| Philadelphia | 338 | 1.9% |
| PhoenixTucson | 338 | 1.9% |
| Pittsburgh | 338 | 1.9% |
| Plains | 338 | 1.9% |
| Portland | 338 | 1.9% |
| Other values (44) | 14869 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
|---|---|---|
| albany | 338 | 1.9% |
| denver | 338 | 1.9% |
| midsouth | 338 | 1.9% |
| baltimorewashington | 338 | 1.9% |
| boise | 338 | 1.9% |
| boston | 338 | 1.9% |
| buffalorochester | 338 | 1.9% |
| california | 338 | 1.9% |
| charlotte | 338 | 1.9% |
| chicago | 338 | 1.9% |
| Other values (44) | 14869 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
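The rank-based calculation described for ρ can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the helper names and the toy data are made up, and ties are given their average rank, which is a common convention assumed here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y is a monotonic but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(spearman(x, y), 6), round(pearson(x, y), 4))
```

On this toy data ρ is 1 (up to floating-point rounding) while r is lower, which illustrates why ρ is better suited to nonlinear monotonic relationships.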
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
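The invariance property can be checked numerically. The sketch below uses the AveragePrice and Total Volume values of the first five sample rows shown at the end of this report; five rows from one region are only an illustration and say nothing about the correlation over the full dataset. The function name and the shift and scale constants are arbitrary choices.

```python
from math import sqrt

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# AveragePrice and Total Volume from the first five sample rows.
price = [1.33, 1.35, 0.93, 1.08, 1.28]
volume = [64236.62, 54876.98, 118220.22, 78992.15, 51039.60]

r = pearson(price, volume)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson([100 * p + 5 for p in price], [v / 1000 - 7 for v in volume])

print(round(r, 4), round(r_rescaled, 4))
```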
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
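The pair-counting definition of τ translates almost directly into code. This sketch implements exactly the formula as worded above (often called tau-a), where tied pairs count as neither concordant nor discordant; library implementations may apply tie corrections, so results can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Toy data: two adjacent swaps in an otherwise increasing sequence.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)  # 8 concordant and 2 discordant pairs out of 10 give 0.6
```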
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion visually.
First rows
| | Date | AveragePrice | Total Volume | 4046 | 4225 | 4770 | Total Bags | Small Bags | Large Bags | XLarge Bags | type | year | region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-12-27 | 1.33 | 64236.62 | 1036.74 | 54454.85 | 48.16 | 8696.87 | 8603.62 | 93.25 | 0.0 | conventional | 2015 | Albany |
| 1 | 2015-12-20 | 1.35 | 54876.98 | 674.28 | 44638.81 | 58.33 | 9505.56 | 9408.07 | 97.49 | 0.0 | conventional | 2015 | Albany |
| 2 | 2015-12-13 | 0.93 | 118220.22 | 794.70 | 109149.67 | 130.50 | 8145.35 | 8042.21 | 103.14 | 0.0 | conventional | 2015 | Albany |
| 3 | 2015-12-06 | 1.08 | 78992.15 | 1132.00 | 71976.41 | 72.58 | 5811.16 | 5677.40 | 133.76 | 0.0 | conventional | 2015 | Albany |
| 4 | 2015-11-29 | 1.28 | 51039.60 | 941.48 | 43838.39 | 75.78 | 6183.95 | 5986.26 | 197.69 | 0.0 | conventional | 2015 | Albany |
| 5 | 2015-11-22 | 1.26 | 55979.78 | 1184.27 | 48067.99 | 43.61 | 6683.91 | 6556.47 | 127.44 | 0.0 | conventional | 2015 | Albany |
| 6 | 2015-11-15 | 0.99 | 83453.76 | 1368.92 | 73672.72 | 93.26 | 8318.86 | 8196.81 | 122.05 | 0.0 | conventional | 2015 | Albany |
| 7 | 2015-11-08 | 0.98 | 109428.33 | 703.75 | 101815.36 | 80.00 | 6829.22 | 6266.85 | 562.37 | 0.0 | conventional | 2015 | Albany |
| 8 | 2015-11-01 | 1.02 | 99811.42 | 1022.15 | 87315.57 | 85.34 | 11388.36 | 11104.53 | 283.83 | 0.0 | conventional | 2015 | Albany |
| 9 | 2015-10-25 | 1.07 | 74338.76 | 842.40 | 64757.44 | 113.00 | 8625.92 | 8061.47 | 564.45 | 0.0 | conventional | 2015 | Albany |
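One plausible reason for the many high-correlation alerts is that some columns are sums of others. The sketch below checks this on the first three rows shown above; whether the identities hold for every row of the dataset is not verified here, and the tuple layout and tolerance are illustrative choices.

```python
from math import isclose

# (Total Volume, 4046, 4225, 4770, Total Bags, Small Bags, Large Bags, XLarge Bags)
rows = [
    (64236.62, 1036.74, 54454.85, 48.16, 8696.87, 8603.62, 93.25, 0.0),
    (54876.98, 674.28, 44638.81, 58.33, 9505.56, 9408.07, 97.49, 0.0),
    (118220.22, 794.70, 109149.67, 130.50, 8145.35, 8042.21, 103.14, 0.0),
]

checks = []
for total_volume, c4046, c4225, c4770, total_bags, small, large, xlarge in rows:
    # Total Bags appears to be the sum of the three bag-size columns.
    bags_ok = isclose(total_bags, small + large + xlarge, abs_tol=0.01)
    # Total Volume appears to be the three numbered columns plus Total Bags.
    volume_ok = isclose(total_volume, c4046 + c4225 + c4770 + total_bags, abs_tol=0.01)
    checks.append((bags_ok, volume_ok))

print(checks)
```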
Last rows
| | Date | AveragePrice | Total Volume | 4046 | 4225 | 4770 | Total Bags | Small Bags | Large Bags | XLarge Bags | type | year | region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18239 | 2018-03-11 | 1.56 | 22128.42 | 2162.67 | 3194.25 | 8.93 | 16762.57 | 16510.32 | 252.25 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18240 | 2018-03-04 | 1.54 | 17393.30 | 1832.24 | 1905.57 | 0.00 | 13655.49 | 13401.93 | 253.56 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18241 | 2018-02-25 | 1.57 | 18421.24 | 1974.26 | 2482.65 | 0.00 | 13964.33 | 13698.27 | 266.06 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18242 | 2018-02-18 | 1.56 | 17597.12 | 1892.05 | 1928.36 | 0.00 | 13776.71 | 13553.53 | 223.18 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18243 | 2018-02-11 | 1.57 | 15986.17 | 1924.28 | 1368.32 | 0.00 | 12693.57 | 12437.35 | 256.22 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18244 | 2018-02-04 | 1.63 | 17074.83 | 2046.96 | 1529.20 | 0.00 | 13498.67 | 13066.82 | 431.85 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18245 | 2018-01-28 | 1.71 | 13888.04 | 1191.70 | 3431.50 | 0.00 | 9264.84 | 8940.04 | 324.80 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18246 | 2018-01-21 | 1.87 | 13766.76 | 1191.92 | 2452.79 | 727.94 | 9394.11 | 9351.80 | 42.31 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18247 | 2018-01-14 | 1.93 | 16205.22 | 1527.63 | 2981.04 | 727.01 | 10969.54 | 10919.54 | 50.00 | 0.0 | organic | 2018 | WestTexNewMexico |
| 18248 | 2018-01-07 | 1.62 | 17489.58 | 2894.77 | 2356.13 | 224.53 | 12014.15 | 11988.14 | 26.01 | 0.0 | organic | 2018 | WestTexNewMexico |